Goto

Collaborating Authors

 unified perspective


MetricFormer: A Unified Perspective of Correlation Exploring in Similarity Learning

Neural Information Processing Systems

Similarity learning can be significantly advanced by informative relationships among different samples and features. The current methods try to excavate the multiple correlations in different aspects, but cannot integrate them into a unified framework. In this paper, we provide to consider the multiple correlations from a unified perspective and propose a new method called MetricFormer, which can effectively capture and model the multiple correlations with an elaborate metric transformer. In MetricFormer, the feature decoupling block is adopted to learn an ensemble of distinct and diverse features with different discriminative characteristics. After that, we apply the batch-wise correlation block into the batch dimension of each mini-batch to implicitly explore sample relationships. Finally, the feature-wise correlation block is performed to discover the intrinsic structural pattern of the ensemble of features and obtain the aggregated feature embedding for similarity measuring. With three kinds of transformer blocks, we can learn more representative features through the proposed MetricFormer. Moreover, our proposed method can be flexibly integrated with any metric learning framework. Extensive experiments on three widely-used datasets demonstrate the superiority of our proposed method over state-of-the-art methods.


Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective

Neural Information Processing Systems

Latent-based image generative models, such as Latent Diffusion Models (LDMs) and Mask Image Models (MIMs), have achieved notable success in image generation tasks. These models typically leverage reconstructive autoencoders like VQGAN or VAE to encode pixels into a more compact latent space and learn the data distribution in the latent space instead of directly from pixels. However, this practice raises a pertinent question: Is it truly the optimal choice? In response, we begin with an intriguing observation: despite sharing the same latent space, autoregressive models significantly lag behind LDMs and MIMs in image generation. This finding contrasts sharply with the field of NLP, where the autoregressive model GPT has established a commanding presence.


Partition-wise Graph Filtering: A Unified Perspective Through the Lens of Graph Coarsening

Li, Guoming, Yang, Jian, Chen, Yifan

arXiv.org Artificial Intelligence

Filtering-based graph neural networks (GNNs) constitute a distinct class of GNNs that employ graph filters to handle graph-structured data, achieving notable success in various graph-related tasks. Conventional methods adopt a graph-wise filtering paradigm, imposing a uniform filter across all nodes, yet recent findings suggest that this rigid paradigm struggles with heterophilic graphs. To overcome this, recent works have introduced node-wise filtering, which assigns distinct filters to individual nodes, offering enhanced adaptability. However, a fundamental gap remains: a comprehensive framework unifying these two strategies is still absent, limiting theoretical insights into the filtering paradigms. Moreover, through the lens of Contextual Stochastic Block Model, we reveal that a synthesis of graph-wise and node-wise filtering provides a sufficient solution for classification on graphs exhibiting both homophily and heterophily, suggesting the risk of excessive parameterization and potential overfitting with node-wise filtering. To address the limitations, this paper introduces Coarsening-guided Partition-wise Filtering (CPF). CPF innovates by performing filtering on node partitions. The method begins with structure-aware partition-wise filtering, which filters node partitions obtained via graph coarsening algorithms, and then performs feature-aware partition-wise filtering, refining node embeddings via filtering on clusters produced by $k$-means clustering over features. In-depth analysis is conducted for each phase of CPF, showing its superiority over other paradigms. Finally, benchmark node classification experiments, along with a real-world graph anomaly detection application, validate CPF's efficacy and practical utility.


MetricFormer: A Unified Perspective of Correlation Exploring in Similarity Learning

Neural Information Processing Systems

Similarity learning can be significantly advanced by informative relationships among different samples and features. The current methods try to excavate the multiple correlations in different aspects, but cannot integrate them into a unified framework. In this paper, we provide to consider the multiple correlations from a unified perspective and propose a new method called MetricFormer, which can effectively capture and model the multiple correlations with an elaborate metric transformer. In MetricFormer, the feature decoupling block is adopted to learn an ensemble of distinct and diverse features with different discriminative characteristics. After that, we apply the batch-wise correlation block into the batch dimension of each mini-batch to implicitly explore sample relationships.


A Unified Perspective on Multiple Shooting In Differential Dynamic Programming

Li, He, Yu, Wenhao, Zhang, Tingnan, Wensing, Patrick M.

arXiv.org Artificial Intelligence

Differential Dynamic Programming (DDP) is an efficient computational tool for solving nonlinear optimal control problems. It was originally designed as a single shooting method and thus is sensitive to the initial guess supplied. This work considers the extension of DDP to multiple shooting (MS), improving its robustness to initial guesses. A novel derivation is proposed that accounts for the defect between shooting segments during the DDP backward pass, while still maintaining quadratic convergence locally. The derivation enables unifying multiple previous MS algorithms, and opens the door to many smaller algorithmic improvements. A penalty method is introduced to strategically control the step size, further improving the convergence performance. An adaptive merit function and a more reliable acceptance condition are employed for globalization. The effects of these improvements are benchmarked for trajectory optimization with a quadrotor, an acrobot, and a manipulator. MS-DDP is also demonstrated for use in Model Predictive Control (MPC) for dynamic jumping with a quadruped robot, showing its benefits over a single shooting approach.


Understanding Diffusion Models: A Unified Perspective

#artificialintelligence

Diffusion models have shown incredible capabilities as generative models; indeed, they power the current state-of-the-art models on text-conditioned image generation such as Imagen and DALL-E 2. In this work we review, demystify, and unify the understanding of diffusion models across both variational and score-based perspectives. We first derive Variational Diffusion Models (VDM) as a special case of a Markovian Hierarchical Variational Autoencoder, where three key assumptions enable tractable computation and scalable optimization of the ELBO. We then prove that optimizing a VDM boils down to learning a neural network to predict one of three potential objectives: the original source input from any arbitrary noisification of it, the original source noise from any arbitrarily noisified input, or the score function of a noisified input at any arbitrary noise level. We then dive deeper into what it means to learn the score function, and connect the variational perspective of a diffusion model explicitly with the Score-based Generative Modeling perspective through Tweedie's Formula. Lastly, we cover how to learn a conditional distribution using diffusion models via guidance.


A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search

Dam, Tuan, D'Eramo, Carlo, Peters, Jan, Pajarinen, Joni

arXiv.org Artificial Intelligence

Monte-Carlo Tree Search (MCTS) is a class of methods for solving complex decision-making problems through the synergy of Monte-Carlo planning and Reinforcement Learning (RL). The highly combinatorial nature of the problems commonly addressed by MCTS requires the use of efficient exploration strategies for navigating the planning tree and quickly convergent value backup methods. These crucial problems are particularly evident in recent advances that combine MCTS with deep neural networks for function approximation. In this work, we propose two methods for improving the convergence rate and exploration based on a newly introduced backup operator and entropy regularization. We provide strong theoretical guarantees to bound convergence rate, approximation error, and regret of our methods. Moreover, we introduce a mathematical framework based on the use of the $\alpha$-divergence for backup and exploration in MCTS. We show that this theoretical formulation unifies different approaches, including our newly introduced ones, under the same mathematical framework, allowing to obtain different methods by simply changing the value of $\alpha$. In practice, our unified perspective offers a flexible way to balance between exploration and exploitation by tuning the single $\alpha$ parameter according to the problem at hand. We validate our methods through a rigorous empirical study from basic toy problems to the complex Atari games, and including both MDP and POMDP problems.


Unified Perspective on Probability Divergence via Maximum Likelihood Density Ratio Estimation: Bridging KL-Divergence and Integral Probability Metrics

Kato, Masahiro, Imaizumi, Masaaki, Minami, Kentaro

arXiv.org Machine Learning

This paper provides a unified perspective for the Kullback-Leibler (KL)-divergence and the integral probability metrics (IPMs) from the perspective of maximum likelihood density-ratio estimation (DRE). Both the KL-divergence and the IPMs are widely used in various fields in applications such as generative modeling. However, a unified understanding of these concepts has still been unexplored. In this paper, we show that the KL-divergence and the IPMs can be represented as maximal likelihoods differing only by sampling schemes, and use this result to derive a unified form of the IPMs and a relaxed estimation method. To develop the estimation problem, we construct an unconstrained maximum likelihood estimator to perform DRE with a stratified sampling scheme. We further propose a novel class of probability divergences, called the Density Ratio Metrics (DRMs), that interpolates the KL-divergence and the IPMs. In addition to these findings, we also introduce some applications of the DRMs, such as DRE and generative adversarial networks. In experiments, we validate the effectiveness of our proposed methods.